Rewind filehandle request bodies before retrying requests #1444
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

```
@@            Coverage Diff            @@
##           master    #1444    +/-   ##
=======================================
  Coverage   88.61%   88.61%
=======================================
  Files          77       77
  Lines       10563    10565       +2
=======================================
+ Hits         9360     9362       +2
  Misses       1203     1203
```

Flags with carried forward coverage won't be shown.
Well, this is quite some serious digging! Thank you @jwodder! I think we should keep the machinery available for now, since IIRC there should be no performance hit, and it might still come in handy if we run into such a situation again for one reason or another. Should we close #1408 altogether with this PR, or do you think there would be more?
```diff
@@ -233,6 +233,8 @@ def request(
                 url,
                 result.text,
             )
+        if data is not None and hasattr(data, "seek"):
+            data.seek(0)
```
could/should we maybe add a test which would trigger such a case, e.g. by shimming `self.session.request` and always failing upon the first try after reading the file?
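A minimal sketch of that shimming idea (not dandi-cli's actual test suite; `FlakySession` and `request_with_retry` are hypothetical stand-ins for `self.session.request` and the retry loop): a fake session consumes the body and fails on the first call, letting us assert that the retry logic rewinds the filehandle before resending.

```python
import io

class FlakySession:
    """Shim for a session whose request() always fails on the first try."""

    def __init__(self):
        self.calls = 0
        self.bodies = []

    def request(self, method, url, data=None):
        self.calls += 1
        # Consume the body, as requests would when streaming a filehandle
        self.bodies.append(data.read())
        if self.calls == 1:
            raise RuntimeError("simulated 500 from S3")
        return "ok"

def request_with_retry(session, method, url, data=None, retries=1):
    for attempt in range(retries + 1):
        try:
            return session.request(method, url, data=data)
        except RuntimeError:
            if attempt == retries:
                raise
            # The fix under review: rewind the body before retrying
            if data is not None and hasattr(data, "seek"):
                data.seek(0)

session = FlakySession()
fh = io.BytesIO(b"zarr entry bytes")
assert request_with_retry(session, "PUT", "https://example.test/upload", data=fh) == "ok"
# Without the seek(0), the second read() would return b"" and the upload
# would fall back to chunked transfer encoding (the HIFNI failure mode).
assert session.bodies == [b"zarr entry bytes", b"zarr entry bytes"]
```

The second assertion is the interesting one: the retry sends the full body again only because the filehandle was rewound between attempts.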
`responses` doesn't seem to support requests where `data` is a filehandle: getsentry/responses#719
ok, if there is no easy way to test ATM, I am ok to proceed to see it fixed and let others give it a shot. Adding the `release` label as well.
Unless you manage to get the error again with this patch (and the run on smaug is currently on its seventh upload without any problems), we can probably close it.
should we just undraft and merge it?
@yarikoptic Undrafted.
🚀 PR was released in
Closes #1408.
Here's what was going on: Occasionally, requests to S3 to upload a Zarr entry fail with a 500 status due to an internal error on S3's end, causing `dandi-cli` to retry the request. When it retries the request, it calls `session.request()` again, passing in the same arguments as before, which include an open filehandle for reading the Zarr entry from disk. However, the filehandle was already read to the end on the initial request (the one that resulted in a 500); thus, although `requests`'s `super_len()` obtains the correct value for the file's length, it then subtracts the filehandle's current position (the end of the file) from this length to get the number of bytes that would be produced by reading from the current position to the end of the file: zero. And, as previously established, when `super_len()` returns 0, `requests` falls back to "chunked" transfer encoding, which S3 responds to with a 501 error about "header implies functionality not implemented" (hereafter "HIFNI").

The logs for Error during upload, "A header you provided implies functionality that is not implemented" #1033, the original report of the HIFNI problem, are no longer available, so I cannot check whether this was happening there as well.
This was what caused the initial HIFNI in Errors while uploading #1257, based on the logs in this comment (I didn't check the other error logs posted later in the thread).
While this patch should eliminate most instances of the HIFNI problem, it is still conceivable that the original hypothesized cause — NFS erroneously reporting filesizes as zero — could occur. @yarikoptic How much of the previously-added infrastructure for dealing with this problem should we keep around after this?